WHAT IS CLAIMED IS: 

1. A system for compressing a data table, comprising:
    a table modeller that discovers data mining models with guaranteed error bounds of at least one attribute in said data table in terms of other attributes in said data table; and
    a model selector, associated with said table modeller, that selects a subset of said at least one model to form a basis upon which to compress said data table.

2. The system as recited in Claim 1 wherein said table modeller employs classification and regression tree data mining models to model said at least one attribute.

3. The system as recited in Claim 1 wherein said model selector employs a Bayesian network built on said at least one attribute to select relevant models for table compression.

4. The system as recited in Claim 2 wherein construction of said models uses integrated building and pruning to exploit specified error bounds and decrease model construction time.

5. The system as recited in Claim 1 wherein said table modeller employs a selected one of a constraint-based and a scoring-based method to generate said at least one model.

6. The system as recited in Claim 1 wherein said model selector selects said subset based upon a compression ratio and an error bound specific for each attribute of said data table.

7. The system as recited in Claim 1 wherein the process by which said model selector selects said subset is NP-hard.

8. The system as recited in Claim 1 wherein said model selector selects said subset using a model built on attributes of said data table by a selected one of:
    repeated calls to a maximum independent set solution algorithm, and
    a greedy search algorithm.

9. A method of compressing a data table, comprising:
    discovering data mining models with guaranteed error bounds of at least one attribute in said data table in terms of other attributes in said data table; and
    selecting a subset of said at least one model to form a basis upon which to compress said data table.

10. The method as recited in Claim 9 wherein said discovering comprises employing classification and regression tree data mining models to model said at least one attribute.

11. The method as recited in Claim 9 wherein said discovering comprises employing a Bayesian network built on said at least one attribute to select relevant models for table compression.

12. The method as recited in Claim 10 further comprising using integrated building and pruning to exploit specified error bounds and decrease model construction time.

13. The method as recited in Claim 9 wherein said discovering comprises employing a selected one of a constraint-based and a scoring-based method to generate said at least one model.

14. The method as recited in Claim 9 wherein said selecting comprises selecting said subset based upon a compression ratio and an error bound specific for each attribute of said data table.

15. The method as recited in Claim 9 wherein said selecting is NP-hard.

16. The method as recited in Claim 9 wherein said selecting comprises selecting said subset using a model built on attributes of said data table by a selected one of:
    repeated calls to a maximum independent set solution algorithm, and
    a greedy search algorithm.

17. A database management system, comprising:
    a data structure having at least one data table therein;
    a database controller for allowing data to be provided to and extracted from said data structure; and
    a system for compressing said at least one data table, including:
        a table modeller that discovers data mining models with guaranteed error bounds of at least one attribute in said data table in terms of other attributes in said data table, and
        a model selector, associated with said table modeller, that selects a subset of said at least one model to form a basis upon which to compress said data table.

18. The system as recited in Claim 17 wherein said table modeller employs classification and regression tree data mining models to model said at least one attribute.

19. The system as recited in Claim 17 wherein said model selector employs a Bayesian network built on said at least one attribute to select relevant models for table compression.

20. The system as recited in Claim 18 wherein construction of said models uses integrated building and pruning to exploit specified error bounds and decrease model construction time.

21. The system as recited in Claim 17 wherein said table modeller employs a selected one of a constraint-based and a scoring-based method to generate said at least one model.

22. The system as recited in Claim 17 wherein said model selector selects said subset based upon a compression ratio and an error bound specific for each attribute of said data table.

23. The system as recited in Claim 17 wherein the process by which said model selector selects said subset is NP-hard.

24. The system as recited in Claim 17 wherein said model selector selects said subset using a model built on attributes of said data table by a selected one of:
    repeated calls to a maximum independent set solution algorithm, and
    a greedy search algorithm.